Abstract

Recent studies suggested a role for the human endogenous retrovirus (HERV) group HERV-K(HML-2) in melanoma because of 
upregulated transcription and expression of HERV-K(HML-2)-encoded proteins. Very little is known about which HML-2 loci are 
transcribed in melanoma. We assigned >1,400 HML-2 cDNA sequences generated from various melanoma and related samples to
genomic HML-2 loci, identifying a total of 23 loci as transcribed. Transcription profiles of loci differed significantly between samples. 
One locus was found transcribed only in melanoma-derived samples but not in melanocytes and might represent a marker for 
melanoma. Several of the transcribed loci harbor ORFs for retroviral Gag and/or Env proteins. Env-encoding loci were transcribed only 
in melanoma. Specific investigation of rec and np9 transcripts indicated transcription of protein-encoding loci in melanoma and
melanocytes hinting at the relevance of Rec and Np9 in melanoma. UVB irradiation changed transcription profiles of loci and overall 
transcript levels decreased in melanoma and melanocytes. We further identified transcribed HML-2 loci formed by reverse transcrip- 
tion of spliced HML-2 transcripts by L1 machinery or in a retroviral fashion, with loci potentially encoding HML-2-like proteins. We 
reveal complex, sample-specific transcription of HML-2 loci in melanoma and related samples. Identified HML-2 loci and proteins 
encoded by those loci are particularly relevant for further studying the role of HML-2 in melanoma. Transcription of HERVs appears as a
complex mechanism requiring specific studies to elucidate which HERV loci are transcribed and how transcribed HERVs may be 
involved in disease. 
Introduction

Approximately 8% of the human genome mass is of direct
or indirect retroviral origin. So-called human endogenous 
retroviruses (HERVs) are remnants of ancient retroviruses 
that infected the genome of germ line cells of human ancestral 
species millions of years ago. Quite a number of phylogenet- 
ically distinct retroviruses invaded the germ line and thus left 
their traces in the human genome. Approximately 40 different 
HERV groups were previously cataloged in the human genome 
sequence. (The term "group" is preferable to the formerly
used term "family" because "family" is reserved for
Retroviridae [Blomberg et al. 2009; Mayer et al. 2011;
Stoye 2012].) Some of those HERV groups significantly amplified
in copy numbers, sometimes up to several thousand
copies, following initial germ line invasion by reinfection or 
intracellular formation of new copies. HERV loci consist of 
so-called proviruses harboring retroviral Gag, Protease, 
Polymerase, and Envelope-coding sequences and flanked by 
long terminal repeats (LTRs). HERV loci are often significantly 
mutated due to their long presence in the genome, displaying
deletions of variable sizes and numerous other nonsense mu-
tations. Some of the mutated HERV loci could also serve as
templates for new copies. Proviral loci were also often re-
duced to so-called solitary LTRs by homologous recombination
between the two proviral LTRs, leaving a single LTR behind
(for reviews, see Mager and Medstrand 2003; Bannert and 
Kurth 2006; Ruprecht, Mayer, et al. 2008; Stoye 2012). 

HERVs influence the human genome and human biology. 
Although coding capacity for original retroviral proteins has 
been impaired by nonsense mutations for most HERVs, many 
HERV loci nevertheless remain transcriptionally active and a 
number of studies identified HERV transcripts from different 
HERV groups in various tissue and cell types. Transcriptional 
activity of HERV loci differs between cell and tissue types and is 
therefore regulated in some way (for instance, see Yin et al. 
1997; Buscher et al. 2005; Frank et al. 2005, 2008; Seifarth 
et al. 2005; Hu et al. 2006; Muradrasoli et al. 2006; Oja et al. 
2007; Haupt et al. 2011; Perot et al. 2012). Several instances
of transcriptionally active HERV loci influencing neighboring 
genes by providing (alternative) promoters have been docu- 
mented. HERV loci can furthermore affect transcripts of cellu- 
lar genes by providing alternative splice and polyadenylation 
signals (reviewed in Cohen et al. 2009). 

Some HERV groups have biological relevance in humans. 
The syncytin-1/ERVWE1 gene is, in essence, a proviral locus of 
the HERV-W group. The Env protein of that HERV-W locus, 
named Syncytin-1, has been selected during evolution for con-
tributing to fusion of cell membranes of trophoblasts to form
syncytiotrophoblasts, which is an essential process during 
human placenta formation (Mallet et al. 2004; Dupressoir 
et al. 2012). The HERV-K(HML-2) group, in short, HML-2, is 
exceptional regarding evolution, coding capacity and potential 
clinical involvement. Besides evolutionarily older loci, the 
HML-2 group comprises younger proviruses, several of them 
being human specific and thus having formed after the evo- 
lutionary split of human from chimpanzee, resulting in pres- 
ence/absence of alleles of HML-2 loci (Reus et al. 2001; 
Hughes and Coffin 2004; Macfarlane and Simmonds 2004; 
Belshaw et al. 2005). Evolutionarily young HML-2 proviruses 
often encode one or several of the former retroviral proteins 
(Kitamura et al. 1996; Schommer et al. 1996; Barbulescu et al.
1999; Mayer et al. 1999; Tonjes et al. 1999). HML-2-encoded
Rec and Np9 proteins, the latter being encoded by so-called 
HML-2 type I proviruses lacking a 292-bp sequence portion 
within the pol-env boundary, interact with cellular proteins 
such as promyelocytic leukemia zinc finger protein, ligand of 
numb protein X, testicular zinc-finger protein, androgen re- 
ceptor, and family with sequence similarity 21 (FAM21) and 
might thus be involved in tumorigenesis (Boese et al. 2000;
Armbruester et al. 2004; Kramer-Hammerle et al. 2005; 
Kaufmann et al. 2010). Nude mice transgenic for the nuclear 
export factor Rec display disturbed germ cell development and 
histological lesions reminiscent of carcinoma in situ (Galli et al. 
2005). We recently identified a novel HML-2-encoded protein, 
Env-SP, a stable signal peptide resulting from processing of 
HML-2 Env, which is very similar in sequence to Rec protein, 
but displays several biological features different from Rec 
(Ruggieri et al. 2009). 



HML-2 expression has been found upregulated in several 
tumor types, among them germ cell tumors (GCTs). GCT pre- 
cursor lesions (carcinoma in situ) already display high amounts 
of HML-2 RNA (Herbst et al. 1996), and patients with GCT 
display high antibody titers against HML-2-encoded Gag and 
Env proteins already at the time of tumor detection (Sauter 
et al. 1995, 1996; Boller et al. 1997). Previous studies also
suggested a role of HML-2 in the etiology of melanoma, the 
most malignant type of skin cancer with increasing incidence 
worldwide that arises from uncontrolled proliferation of 
pigment-producing melanocytes in skin, mucosa, or uvea 
(Dennis 1999). Some melanoma cell lines were shown to pro- 
duce retrovirus-like particles (RVLPs) with reverse transcriptase 
activity that contain HML-2-like sequences and Gag and Env 
protein (Muster et al. 2003). Those RVLPs were subsequently 
found to be defective and noninfectious (Buscher et al. 2005). 
Expression of HML-2 was reported as elevated in melanoma, 
which may be due to increased promoter activity and demeth-
ylation of the HML-2 proviral 5′-LTR (Stengel et al. 2010).
HML-2 RNA, as well as Gag, Env, Rec, and Np9 proteins, 
was identified in melanoma tissue and melanoma cell lines 
(Buscher et al. 2005, 2006; Reiche et al. 2010). HERV-K-spe- 
cific antibodies were found in melanoma patients, and sero- 
logical response against HML-2 Gag and Env protein was 
furthermore reported and correlated with survival probability 
(Buscher et al. 2005; Hahn et al. 2008). Adherent melanoma 
cells were shown to undergo a transition to a more malignant, 
nonadherent phenotype when exposed to stress, accompa- 
nied by HERV-K(HML-2) expression (Serafino et al. 2009). 
Specific activation of HML-2 expression was furthermore 
reported in response to UV radiation, the most established 
environmental risk factor in melanoma development 
(Schanab et al. 2011). 

However, it still remains uncertain whether HML-2 retrovi- 
ral sequences are activated as a corollary of malignant trans- 
formation or if they might actively participate in tumor 
development and the ability of melanoma cells to escape 
immunosurveillance. 

On the basis of the description of overall transcript levels of 
HERV groups in cells and tissues, we and others recently 
established strategies for identifying those HERV loci that ac- 
tually contribute to HERV-group-specific RNAs. If HERV RNA 
or HERV-encoded proteins are of biological relevance, solely 
the transcribed HERV loci should obviously be of relevance and 
their identification is therefore of primary interest. A recent 
microarray-based approach aims at identifying transcribed 
HERV loci by a collection of locus-specific oligonucleotides
(Perot et al. 2012). We recently established a strategy that 
involves generation of HERV-specific cDNA sequences by 
RT-PCR followed by cloning and sequencing of RT-PCR prod- 
ucts. HERV cDNA sequences can then be assigned to genomic 
HERV loci by sequence comparisons (Flockerzi et al. 2008). We 
thus obtained transcription profiles of HML-2 loci and loci of 
other HERV groups in various tissue and cell types, with 
transcription levels of specific HERV loci differing significantly
between examined tissues and cell types (Flockerzi et al. 2008;
Frank et al. 2008; Ruprecht et al. 2008; Laufer et al. 2009). 

To contribute to a better understanding of the biological 
role of HML-2 in melanoma, we set out to identify HML-2 loci
transcribed in melanoma. We generated HML-2 cDNA
sequences from melanoma tissue, melanoma cell lines, and
melanocyte cell lines. We derived transcription profiles for 
HML-2 loci and correlated transcription of HML-2 loci with 
provirus structures and coding capacity for Gag, Env, Rec, 
and Np9 proteins. We also quantified HML-2 transcript 
levels in the various samples and examined differences in tran- 
scription profiles following UV irradiation. Finally, we analyzed 
in more detail transcribed HML-2 loci that appeared to be 
spliced and subsequently retrotransposed HML-2 messenger 
RNAs (mRNAs). 



Materials and Methods 

Melanoma Cell Lines and Human Tissue Samples 

Noncommercial WM3734 human melanoma cell lines were 
obtained in accordance with consent procedures approved by 
the Internal Review Boards of the University of Pennsylvania 
School of Medicine and The Wistar Institute. WM3734a 
(BRAFV600E, NRAS wt) were isolated and routinely main- 
tained in Tu2% medium, consisting of 80% [v/v] MCDB 
153 Basal Medium (Biochrom, Berlin, Germany) and 20% 
[v/v] L-15 Leibovitz Medium (Biochrom, Berlin, Germany) sup-
plemented with 2% [v/v] FBS, 1.68 mM CaCl2, and 2.5 ng/ml
insulin, as described previously (Satyamoorthy et al. 1997). 
SK-Mel-25, SK-Mel-28 (BRAFV600E, CDK4R24C) (Muster 
et al. 2003; Buscher et al. 2005, 2006; Stengel et al. 2010), 
and MEWO (Buscher et al. 2005; Reiche et al. 2010) cells 
were cultured in RPMI 1640 medium (Gibco, Life 
Technologies, Carlsbad, CA) supplemented with 10% [v/v] 
FCS. Melanocyte cell lines Benno and Oskar were cultured in 
Melanocyte Growth Medium (PromoCell, Heidelberg, 
Germany). All cells were cultured at 37 °C in a humidified 
5% [v/v] CO2 atmosphere. The consistency of cellular geno- 
types was confirmed by DNA fingerprinting at the 
Department of Human Genetics of The Saarland University 
Hospital. Human melanoma tissues were obtained in accor- 
dance with consent procedures approved by the ethical 
review committee of the Saarland state chamber of physicians 
(No. 178/11). Melanoma lymph node metastases were stored
at -80 °C immediately following surgical removal at the 
Department of Dermatology, Medical Faculty, University 
Clinic of Saarland, Homburg, Germany. Total RNAs from ma- 
lignant melanoma metastases were obtained from Cambridge 
Bioscience (Cambridge, UK). Samples contained at least 90%
tumor tissue and were derived from jejunum (cat# CR562868),
lung (cat# CR562901), and lymph node (cat# CR561386).
Minimum stage grouping was IV for samples CR562868 and
CR562901 and IIIB for sample CR561386.

RNA Isolation, Generation of cDNA, and PCR 

RNA was isolated from cells and metastases using the RNeasy 
Mini Kit (Qiagen, Hilden, Germany) following the manufac-
turer's instructions. Cultured cells were pelleted by centrifuga-
tion for 5 min at 300 × g, resuspended, and applied to a
QIAshredder column (Qiagen, Hilden, Germany), according 
to the manufacturer's instructions, before application 
to an RNeasy Mini spin column. Tissue samples from metas- 
tases were homogenized in cold Trizol (Invitrogen, Life 
Technologies, Carlsbad, CA), applied to a QIAshredder 
column (Qiagen, Hilden, Germany), centrifuged at
15,000 × g for 2 min, and incubated at 30 °C for 5 min.
After addition of 0.4 ml chloroform, samples were mixed by
vortexing for 40 s, incubated at 30 °C for 3 min, and centri-
fuged at 8,600 × g and 4 °C for 20 min. The upper phase was
mixed with 1 vol. 70% [v/v] ethanol and applied to an RNeasy
Mini spin column. To remove traces of genomic DNA, isolated
RNA was treated with the Turbo DNA-free™ Kit (Applied
Biosystems, Foster City, CA). RNA (10 µg) was treated in a
50 µl reaction, according to the manufacturer's instructions.
Removal of DNA was verified by Alu-element-specific PCR 
(Klein et al. 1993) and further controls in subsequent experi- 
ments (see below). 

cDNA was generated using the Omniscript RT Kit (Qiagen, 
Hilden, Germany) and 10 µM of random hexanucleotide pri-
mers in a 30 µl reaction, following the manufacturer's instruc-
tions. 6 µl of the DNase reaction containing 1.2 µg RNA and
random primers were combined, denatured at 70 °C for 
5 min, and cooled to room temperature to allow primer 
annealing. Two reactions were prepared for each RNA 
sample, one of them serving as RT(-) negative control. A mas- 
termix was prepared with buffer, dNTPs, and RNase inhibitor, 
which was distributed to all RNA samples. RT enzyme was 
subsequently added to the RT reactions and an equal 
amount of water to corresponding RT(-) controls. 

For subsequent gag-specific PCR of cDNA, primers were 
optimized toward amplification of as many loci as possible 
in one PCR. An alignment of all genomic HML-2 loci was 
generated using BLAT results of the HML-2 consensus se- 
quence obtained from RepBase. Coordinates and sequences 
of complete loci including LTRs, if present, were assembled 
using the UCSC table browser (Karolchik et al. 2004). 
Combinations of four forward and three reverse primers, re- 
flecting nucleotide differences in the primer binding regions of 
the various HERV-K(HML-2) target sequences, were used to 
amplify an approximately 620-bp portion of the HML-2 
gag gene (nt 1,778-2,396 in the HERV-K(HML-2.HOM) refer-
ence sequence, GenBank accession no. AF074086.2). Forward
primers HML-2_1778_for1: 5′-CCCCCAGAAAGTCAGTATGGA-3′;
HML-2_1778_for2: 5′-TCTCCAGAGGTTCAGTATGGA-3′;
HML-2_1778_for3: 5′-CCCCCAGAAAATCAGTATGGA-3′; and
HML-2_1778_for4: 5′-TCTCCAGAGGTGCAGTATAGA-3′ were
combined in a 10:3:2:1 ratio. Reverse primers
HML-2_2396_rev1: 5′-TTTCCCAGGCTCTAAGGCAG-3′;
HML-2_2396_rev2: 5′-TTCCCAGGCCCTGAGGCAA-3′; and
HML-2_2396_rev3: 5′-TTTCCTAGGCTCTAAGGCAG-3′ were
combined in a 12:6:1 ratio, corresponding
roughly to the number of potential targets of each primer
variant. The PCR mix contained forward and reverse primer
mixes at 0.5 µM each, 1 U recombinant Taq DNA polymerase
(Invitrogen, Life Technologies, Carlsbad, CA), 1.5 mM MgCl2,
0.2 mM of each dNTP, 1× PCR buffer, and 1 µl of cDNA in a
20 µl reaction. A water control was included, as well as a
negative control for each sample, containing 1 µl of the re-
spective RT(-) reaction as template. Cycling conditions were as
follows: initial denaturation for 3 min at 95 °C; 40 cycles of
50 s at 95 °C, 50 s at 53 °C, and 1 min at 72 °C; and final elongation
for 10 min at 72 °C. The low-stringency annealing temperature
should contribute to amplification of HML-2 loci for which 
primers do not match perfectly. For amplification of the rec/ 
np9 region, primers spanning nt 6,451-8,728 in the 
HERV-K(HML-2.HOM) reference sequence were used, gener-
ating products of approximately 370 bp for np9 and approx-
imately 580 bp for rec. The forward primers
np9-FOR-1: 5′-ATGAACCCATCAGAGATGCAA-3′;
np9-FOR-2: 5′-ATGAATCCATCAGAGATGCAA-3′;
np9-FOR-3: 5′-GCGAACCCTTCAGAGATGCAA-3′; and
np9-FOR-4: 5′-ATGAACCCATCGGAGATGAAA-3′ were
combined in a 17:1:1:1 ratio. Reverse
primers np9-REV-1: 5′-AGCATCTGTTTAACAAAGCA-3′ and
np9-REV-2: 5′-AGCATGTTTAACAAAGCA-3′ were combined
in a 19:1 ratio. PCR mix, controls, and cycling conditions
were the same as for the gag amplicon except for an annealing
temperature of 54 °C. gag and rec/np9 RT-PCR products were
separated by agarose gel electrophoresis and purified using
the QIAquick Gel Extraction Kit (Qiagen, Hilden, Germany)
following the manufacturer's instructions. RT-PCR products
were subsequently cloned into pGEM-T® Easy vector
(Promega, Fitchburg, WI), and ligations were transformed
into chemocompetent E. coli DH5α cells. Insert-containing
clones were identified by colony PCR using vector-specific
M13 primers. Plasmid DNA of positive clones was isolated in
a 96-well format using the Agencourt CosMCPrep Kit
(Beckman Coulter Genomics, Danvers, MA). Sequences of
cloned RT-PCR products were generated by Seq-IT GmbH
(Kaiserslautern, Germany) using vector-specific T7 primer
and an Applied Biosystems 3730 DNA-Analyzer.
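
The primer ratios given above translate into individual primer concentrations once a total mix concentration is fixed. The following is a minimal sketch, assuming the stated ratios are molar ratios and that 0.5 µM refers to the whole forward (or reverse) mix in the final reaction; the function name `split_primer_mix` is hypothetical and not part of the original protocol.

```python
def split_primer_mix(ratio_parts, total_um):
    """Split a total primer-mix concentration (in uM) according to ratio parts.

    ratio_parts: dict mapping primer name -> integer part of the ratio.
    total_um: final concentration of the whole mix in the reaction (uM).
    """
    total_parts = sum(ratio_parts.values())
    return {name: total_um * parts / total_parts
            for name, parts in ratio_parts.items()}


# Ratios and the 0.5 uM total are taken from the text; treating the ratios
# as molar ratios is an assumption made for this illustration.
gag_forward = split_primer_mix({"for1": 10, "for2": 3, "for3": 2, "for4": 1}, 0.5)
gag_reverse = split_primer_mix({"rev1": 12, "rev2": 6, "rev3": 1}, 0.5)

for name, conc in {**gag_forward, **gag_reverse}.items():
    print(f"{name}: {conc:.4f} uM")
```

Under these assumptions, the most abundant forward primer variant is present at roughly 0.31 µM and the rarest at roughly 0.03 µM.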

Assignment of cDNA Sequences to Genomic HML-2 Loci 

Quality of generated cDNA sequences was checked by eye 
using FinchTV (Geospiza Inc., PerkinElmer, Seattle, WA), and 
poor quality reads were excluded from further analysis. Vector 
and PCR primer sequence portions were removed from cDNA
sequences using Geneious software (Biomatters Ltd.,
Auckland, New Zealand). Trimmed cDNA sequences were as-
signed to genomic HML-2 loci by BLAT searching the human
NCBI36/hg18 reference genome sequence at the UCSC
Genome Browser (Kent et al. 2002). An assignment was de-
fined as unambiguous if there was only one best match with
no more than six mismatches to the corresponding genomic se-
quence and the second best match displayed at least one
more mismatch. Sequences with more than six mismatches
were excluded from the analysis, allowing up to 1% differ-
ence in sequence due to RT-PCR and sequencing errors and
interindividual differences. Unlike K115, the polymorphic
K113 provirus in 19p12 is not annotated in the Genome
Browser assembly NCBI36/hg18 and could thus not be iden-
tified by our BLAT search. The investigated amplicon region of
K113 matches that of the HML-2 locus in 1q22 with one 
nucleotide difference. The relevant nucleotide position of all 
sequences mapping to 1q22 was thus checked manually to 
exclude that they are actually derived from K113. ORFs for 
retroviral proteins Gag, Env, Rec, and Np9 were predicted 
using Geneious software (Biomatters Ltd., Auckland, New 
Zealand). In case of Rec and Np9, untrimmed sequences in- 
cluding the primer binding region were used, as the ATG start 
codon is part of the forward primer used. Sequence identities 
and similarities were calculated using SIAS (http://imed.med.ucm.es/Tools/sias.html,
last accessed January 31, 2013).
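
The assignment rule described in this section can be sketched in a few lines. This is an illustration only, not the authors' pipeline: it assumes that BLAT hits for one cDNA sequence have already been reduced to (locus, mismatch count) pairs, and the function name `assign_locus` and the input format are hypothetical. The cutoff of six mismatches follows the exclusion criterion stated above.

```python
def assign_locus(hits, max_mismatches=6):
    """Return the uniquely best-matching locus, or None if ambiguous/excluded.

    hits: list of (locus_name, mismatch_count) pairs for one cDNA sequence.
    max_mismatches: sequences whose best match exceeds this are excluded.
    """
    if not hits:
        return None
    ranked = sorted(hits, key=lambda hit: hit[1])
    best_locus, best_mismatches = ranked[0]
    if best_mismatches > max_mismatches:
        return None  # too divergent, e.g. possible ex vivo recombinant
    if len(ranked) > 1 and ranked[1][1] == best_mismatches:
        return None  # more than one best match: ambiguous
    return best_locus


# Locus names are chromosomal bands mentioned in the text; the mismatch
# counts are made up for illustration.
print(assign_locus([("7p22.1", 1), ("1q22", 3)]))   # unique best match
print(assign_locus([("7p22.1", 2), ("1q22", 2)]))   # tie between two loci
print(assign_locus([("3q12.3", 7)]))                # above the cutoff
```

The manual check for K113 versus the 1q22 locus is not covered by this sketch, because K113 is absent from the reference assembly and therefore never appears among the hits.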

UV Irradiation of Cultured Melanoma Cells 

Cells were grown to approximately 70-80% confluency. 
Following the experimental strategy described by
Schanab et al. (2011), cell culture medium was discarded, 
and cells were washed with 1 x DPBS (Gibco, Life 
Technologies, Carlsbad, CA). DPBS was removed leaving a 
small amount of fluid to keep cells moist, while cells were 
irradiated with 200 mJ/cm² UVB using a UV409T ultraviolet
lamp (Waldmann, Villingen-Schwenningen, Germany). The 
fluence rate of the lamp at the site of irradiation was mea- 
sured using a UV-Meter Vario Control (Waldmann, 
Villingen-Schwenningen, Germany). Cells were subsequently 
cultured in fresh media at 37°C/5% CO2 [v/v]. Cells were 
harvested 24 h later, and total RNA was isolated as described 
earlier. 

qRT-PCR

HML-2 transcription was quantified by real-time PCR using
the KAPA™ SYBR® FAST qPCR MasterMix Universal Kit
(Peqlab, Erlangen, Germany) and the StepOnePlus™
Real-Time PCR system (Applied Biosystems, Foster City, CA).
The PCR mix contained 1× KAPA™ SYBR® FAST qPCR
MasterMix Universal, 10 µM of forward and reverse primer
each, 1× ROX Reference Dye High, and 1 µl of a 1:10 dilution
of cDNA or RT(-) control in a 10 µl reaction. Cycling conditions
were as follows. Cycling stage: 1 min at 95 °C, 40 cycles of
3 s at 95 °C, 30 s at 55 °C, and 30 s at 72 °C. Melt curve
stage: 15 s at 95 °C, 1 min at 60 °C, and 15 s at 95 °C, with reporter
signal measured every 0.3 °C during the second 95 °C ramp.
For HML-2 gag-specific amplifications, the above-described PCR
primer mixes were used. HML-2 transcript levels were normal-
ized to transcript levels of housekeeping genes G6PDH (for-
ward primer: 5′-ATCGACCACTACCTGGGCAA-3′; reverse
primer: 5′-TTCTGCATCACGTCCCGGA-3′) and RPII (forward
primer: 5′-GCACCACGTCCAATGACAT-3′; reverse primer:
5′-GTGCGGCTGCTTCCATAA-3′) (Radonic et al. 2004). For
each amplicon, cDNA samples were measured in triplicate
and the RT(-) reaction of each sample was measured once.
A water control was included for each amplicon. StepOne™
Software version 2.2.2 (Applied Biosystems, Foster City, CA)
was employed for normalization and further analysis of mea- 
sured cDNA levels. 
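
Normalization to housekeeping genes is commonly expressed as a ΔCt calculation. The sketch below illustrates that general approach under the assumption of 100% amplification efficiency (a doubling per cycle); it is not a description of what the StepOne software computes internally, and the Ct values and the function name `relative_level` are invented for illustration.

```python
from statistics import mean


def relative_level(target_cts, reference_cts):
    """Relative transcript level as 2^-(Ct_target - Ct_reference).

    target_cts: replicate Ct values for the target amplicon.
    reference_cts: dict mapping reference gene -> replicate Ct values.
    Averaging Ct values across reference genes corresponds to a geometric
    mean of their quantities; 100% PCR efficiency is assumed.
    """
    reference_ct = mean(mean(cts) for cts in reference_cts.values())
    delta_ct = mean(target_cts) - reference_ct
    return 2.0 ** (-delta_ct)


# Hypothetical triplicate Ct values, not measured data.
gag_cts = [24.0, 24.5, 23.5]
housekeeping = {"G6PDH": [20.0, 20.0, 20.0], "RPII": [22.0, 22.0, 22.0]}
print(relative_level(gag_cts, housekeeping))
```

With these made-up values the target lies three cycles behind the averaged reference, which corresponds to one eighth of the reference level.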

Testing Employed Cell Lines for Polymorphic HML-2 Loci 

Presence of HML-2 loci previously described to be polymorphic 
in the population was examined by long-PCR using primers 
flanking the HML-2 locus. For the 1p31.1 locus that was 
formed within L1 elements, a reliable long-PCR strategy
could not be obtained. Instead, a second primer located in 
the gag portion of the proviral sequence was used. 
Primer sequences are listed in supplementary table S1,
Supplementary Material online. The PCR mix contained 
0.5 µM of forward and reverse primer each, 200 µM of each
dNTP, 1× buffer HF, 0.4 U Phusion® High-Fidelity DNA poly-
merase (Finnzymes, Thermo Fisher Scientific, Waltham, MA),
and 100 ng genomic DNA in a 20 µl reaction. The cycling
conditions were as follows: initial denaturation 30 s at 98 °C,
35 cycles of 10 s at 98 °C, 30 s at 62 °C, and 5 min at 72 °C,
and final elongation 10 min at 72 °C. Presence was verified in 
a second PCR using the forward primer and a second primer 
located in the HML-2 LTR. Primer mix and cycling conditions 
were as described above except an elongation time of 45 s. 

Results 

Upregulated HERV-K(HML-2) transcription was reported for 
melanoma, and an involvement of HML-2-encoded proteins 
in melanoma development was considered before (Buscher 
et al. 2005, 2006; Reiche et al. 2010; Schanab et al. 2011).
Transcribed HML-2 loci may be regarded as of greater biolog- 
ical relevance than nontranscribed loci. We identified in this 
study HML-2 loci being transcribed in melanoma and thus 
actually generating the HML-2 RNA detected in previous 
studies. 

Strategy for Identification of Transcribed HML-2 Loci 

To identify transcribed HML-2 loci in melanoma, we per- 
formed HML-2-specific RT-PCR on total RNA isolated from 
various melanoma-derived cell lines, melanoma tumor sam-
ples, and melanocyte cell lines followed by assignment of
cDNA sequences to particular HML-2 loci. HML-2-specific
PCR primers were located in the central portion of the
HML-2 gag gene (fig. 1a). Primers were optimized toward
amplification of as many loci as possible in one PCR. A com- 
plete alignment of genomic HML-2 loci was generated as de- 
scribed in the Materials and Methods section. The alignment 
comprised a total of 80 HML-2 loci, including the polymorphic 
proviruses HERV-K113 and HERV-K115. Thirty-six HML-2 loci 
were close to full length with internal sequences >7,000 bp 
(supplementary fig. S1, Supplementary Material online). For
cDNA amplification, a region of approximately 620 bp in the 
central portion of the gag gene was chosen (nt 1,778-2,396 
in the HERV-K(HML-2.HOM) reference sequence [GenBank
accession no. AF074086.2]) that is present in 35 (97.22%) 
of the full-length loci, including HERV-K113 and HERV- 
K115, and 13 shorter loci with internal sequences between 
3,889 and 6,579 bp. We chose this gag region because it 
shows high sequence similarity in the primer regions but still 
allows discrimination of loci due to a relatively high number of 
nucleotide differences within the amplicon. All but two loci, 
both located in 1p36.21, can be distinguished by at least one 
nucleotide sequence difference, as well as various indel posi-
tions (fig. 1b).

With respect to a previous study by Flockerzi et al. (2008), 
which employed a partially overlapping HML-2 region located 
approximately 150 bp upstream of our amplicon, we further 
improved the amplification strategy. We increased the 
number of HML-2 loci recognized by using mixes of several 
forward and reverse primers considering nucleotide differ- 
ences between primer and template sequence in the various 
HML-2 loci. In total, we designed four different forward and 
three different reverse primers. Twenty-eight HML-2 loci were 
identical in sequence for both primer regions for one of the 
primer variants, compared with 13 loci in the study by 
Flockerzi et al. (2008) that did not yet employ such primer 
variants. Another 11 HML-2 loci display one or two mis-
matches in one or both primers with mismatches being lo-
cated away from the primers' 3′-ends. Another six HML-2 loci
show more than two mismatches, and three loci have a nu-
cleotide variation in the 3′-end of the primer binding region,
which likely inhibits amplification. Taken together, use of 
primer variants based on sequence variations of HML-2 loci 
significantly increases the number of amplifiable loci. 
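
The classification of loci by primer fit can be illustrated with a small helper that counts mismatches between a primer and its binding site and flags mismatches near the 3′-end. This is a sketch under stated assumptions: the binding site is given in the same orientation as the primer, and the window of three bases used to define "3′-end" is an illustrative choice, not a value from this study. The function name `primer_mismatches` is hypothetical.

```python
def primer_mismatches(primer, site, three_prime_window=3):
    """Count mismatches and report whether any falls near the primer's 3' end.

    primer, site: equal-length sequences in the same (5'->3') orientation.
    three_prime_window: number of 3'-terminal bases treated as critical
    (an assumption made for this illustration).
    Returns (mismatch_count, has_three_prime_mismatch).
    """
    if len(primer) != len(site):
        raise ValueError("primer and binding site must have equal length")
    positions = [i for i, (p, s) in enumerate(zip(primer, site)) if p != s]
    critical_start = len(primer) - three_prime_window
    near_three_prime = any(i >= critical_start for i in positions)
    return len(positions), near_three_prime


# Forward primer variants 1 and 3 from the Materials and Methods section;
# variant 3 stands in for a locus whose binding site differs from variant 1.
for1 = "CCCCCAGAAAGTCAGTATGGA"
for3 = "CCCCCAGAAAATCAGTATGGA"
print(primer_mismatches(for1, for3))
```

In this example the single difference lies in the middle of the primer, the kind of mismatch that low-stringency annealing is expected to tolerate.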

In addition, PCR was performed under low stringency an- 
nealing conditions, further contributing to amplification of 
HML-2 loci for which primers do not match perfectly. This 
was confirmed subsequently as eight of the loci amplified 
from genomic and/or cDNA showed one or two mismatches 
to the primer variants (see later). 

PCR primers were verified on genomic DNA. Fifty-five gen- 
erated sequences could be unambiguously assigned to geno- 
mic HML-2 loci. In total, 18 different loci located on 11 
different chromosomes were amplified, with four loci being 
located on chromosome 3 and three loci each on
chromosomes 5 and 7 (supplementary material, Supplemen-
tary Material online). Five of those loci show one or two mis-
matches to the employed forward and/or reverse primer
sequences, showing that low stringency PCR conditions also
allow for amplification of such loci.

Fig. 1. — HERV-K(HML-2) provirus structure and sequence comparison of the investigated gag amplicon. (a) Schematic representation of the HML-2
provirus structure and splice products. Splice donor (SD) and acceptor (SA) sites for env and the additionally spliced rec mRNA in type II proviruses are
indicated. In type I proviruses, which lack a 292 bp sequence portion comprising the rec splice donor site (SD2) at the pol/env boundary, an alternative splice
donor site upstream of the rec SD2 is used to produce the shorter np9 mRNA. The position of the gag amplicon used here to investigate HML-2 transcription is
indicated. (b) Neighbor joining tree depicting the absolute number of differences (excluding indel positions) between HML-2 loci for the proviral gag portion
amplified in this study. Primer-binding regions were deleted before sequence comparisons. Taxon names give the chromosomal band, and, if applicable,
HGNC approved names of loci. The scale bar depicts 1 nt sequence difference. All but two loci, both located in 1p36.21, can be distinguished by at least one
nucleotide difference.

Fig. 2. — Summary of mismatches of generated gag cDNA sequences to their best BLAT match. (a) Gag sequences amplified from cDNA from various
melanoma cell lines and samples, as well as melanocyte cell lines, were assigned to genomic HML-2 loci using BLAT. Given are absolute numbers of
mismatches for 945 unambiguously assignable sequences to their best BLAT match. Greater than 10 mismatches were summarized. cDNA sequences with
more than six mismatches were excluded from analysis. (b) Mismatches of HML-2 gag cDNA sequences generated from melanoma and melanocyte cell lines
to their best BLAT match, before (black) and 24 h after (gray) irradiation with 200 mJ/cm² UVB. Given are the relative number of mismatches of 527
sequences generated from cell line cDNAs before and of 479 sequences after irradiation.

RT-PCR products were cloned, sequenced, and assigned to 
genomic HML-2 loci based on characteristic nucleotide differ- 
ences between loci. A cDNA/HML-2 locus assignment was 
regarded as unambiguous if there was only one best BLAT
match of a cDNA sequence to an HML-2 locus, and second, 
third, and so on ranking matches displayed more differences 
(see also the Materials and Methods section). Sequences with 
more than six mismatches were excluded from the analysis, 
allowing up to approximately 1 % difference in sequence due 
to RT-PCR and sequencing errors and interindividual se- 
quence differences (single-nucleotide polymorphisms [SNPs]). 
According to previous work, cDNA sequences with unusually 
high numbers of mismatches most likely arise due to ex vivo 
recombination between transcripts/cDNA from various tran- 
scribed loci (Flockerzi et al. 2007). The ERVK-6 locus in 7p22.1 
(HERV-K(HML-2.HOM)), which is known to be allelic in that it
is present as a single provirus or as two tandem proviruses 
sharing a central LTR, was treated differently. As both proviral 
portions differ only by a few nucleotides along approximately 
7kb, the generated cDNA sequences mapping to 7p22.1 
could not be assigned unambiguously to either one of the 
two proviruses. The tandem provirus was thus regarded as 
one locus. 

Identification of Transcribed HML-2 Loci in Melanoma 
and Related Samples 

To identify transcribed HML-2 loci in melanoma, we
analyzed melanoma cell lines SK-Mel-25, SK-Mel-28,
MEWO, and WM3734a, three melanoma RNA samples, two mel-
anoma lymph node metastases, and the two melanocyte cell
lines Benno and Oskar.

In total, 991 gag gene-derived cDNA sequences of suffi- 
cient quality were generated and assigned to HML-2 loci by 
sequence comparisons. An average of four sequences (5%) 
per sample displayed more than one best match in BLAT 
searches and were excluded from analysis because they 
could not be assigned unambiguously to one particular 
HML-2 locus. As mentioned earlier, the tandem proviral 
HERV-K(HML-2.HOM) locus was regarded as one locus. The
remaining unambiguously assignable 945 sequences showed, 
on average, 1.15 mismatches to their best BLAT match. 
Altogether, 918 (97.14%) of those sequences could be
assigned unambiguously to a genomic locus with no more than
six mismatches, with 489 (51.75%) and 259 (27.41%)
sequences showing no mismatch or a single mismatch, respectively, to
the best BLAT match (fig. 2a). Ninety-five (10.05%), 39
(4.13%), 13 (1.38%), 15 (1.59%), and 8 (0.85%) sequences 
showed 2, 3, 4, 5, and 6 mismatches, respectively. The re- 
maining 2.86% of sequences with more than six mismatches 
were excluded from further analysis. 
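
The percentages in this paragraph follow directly from the reported counts. The short sketch below recomputes them; the counts, the total of 945 assignable sequences, and the number of samples (11, as listed at the start of this section) are taken from the text, while the variable names are only illustrative.

```python
# Number of cDNA sequences by mismatch count to the best BLAT match (from the text).
counts_by_mismatches = {0: 489, 1: 259, 2: 95, 3: 39, 4: 13, 5: 15, 6: 8}
total_assignable = 945   # unambiguously assignable sequences
n_samples = 11           # 4 cell lines + 3 RNA samples + 2 metastases + 2 melanocyte lines

# Share of each mismatch class among all assignable sequences, in percent.
percent_by_mismatches = {
    m: round(100.0 * n / total_assignable, 2)
    for m, n in counts_by_mismatches.items()
}

retained = sum(counts_by_mismatches.values())                 # sequences kept
retained_percent = round(100.0 * retained / total_assignable, 2)
excluded_percent = round(100.0 * (total_assignable - retained) / total_assignable, 2)
mean_per_sample = round(retained / n_samples, 2)

print(percent_by_mismatches)
print(retained, retained_percent, excluded_percent, mean_per_sample)
```

The retained total of 918 sequences includes the eight sequences with exactly six mismatches, which is why the cutoff is read here as "six or fewer".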

For each sample, an average of 83.45 cDNA sequences
could be assigned unambiguously to an HML-2 locus, thus
identifying that locus as transcribed in the particular sample.
We identified a total of 22 different HML-2 loci as transcribed
in the investigated samples (table 1). Four of these loci,
located on human chromosomes 8p23.1 (ERVK-26), 11q12.3
(ERVK-27), 19q12 (ERVK-28), and 19q13.12 (ERVK-29), have
not been described as transcribed before. Note that the
8p23.1 locus is different from HERV-K115, which is located in
the same chromosomal band. In accord with a recent initiative
for assigning names to transcribed HERV loci (Mayer et al.
2011), cDNA sequences corresponding to the four HML-2
loci so far not described as transcribed were submitted as EST
sequences to GenBank (JZ164910, JZ164909, JZ164911, and
JZ164912), and accession numbers were linked to HUGO
Gene Nomenclature Committee approved designations for
those HML-2 loci (ERVK-26 to ERVK-29 in table 1). On average,
eight different HML-2 loci were transcribed in each
sample. The transcribed loci were located on 12 different
chromosomes, with five loci being located on chromosome
3 and three loci each on chromosomes 1 and 7. Six of those
loci show one or two mismatches to the forward or reverse
primer variants employed for amplification.

As a higher transcription rate of a particular HML-2 locus 
(and thus higher amounts of the particular cDNA) increases its 
probability of being cloned and sequenced, relative cloning 
frequency of cDNA from a particular HML-2 locus roughly 
correlates with relative transcription rate of that locus. In 
each sample, few loci showed high relative cloning frequen- 
cies of up to 92%, suggesting a much higher level of tran- 
scription of those HML-2 loci compared with other transcribed 
HML-2 loci (table 1). Especially the ERVK-14 locus in 7q22.1 
shows high cloning frequencies ranging from 46% to 92% in 
five melanoma samples and both melanocyte cell lines. The 
ERVK-6 locus in 7p22.1 shows high cloning frequencies rang- 
ing from 21% to 75% in four melanoma samples. Loci 
ERVK-1, ERVK-7, and ERVK-5 show frequencies ranging 
from 29% to 53% in up to two melanoma samples. Most 
other HML-2 loci appear to be transcribed at relatively low 
levels. 
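
As a rough illustration of how relative cloning frequencies are derived, the following sketch counts the cDNA sequences assigned to each locus in one sample and expresses them as percentages of all assigned sequences. The example sample is hypothetical and only chosen so that the dominant locus reaches 92%; the function name is likewise an assumption.

```python
from collections import Counter
from typing import Dict, Iterable


def relative_cloning_frequencies(assigned_loci: Iterable[str]) -> Dict[str, float]:
    """Percentage of unambiguously assigned cDNA sequences per locus in one sample."""
    counts = Counter(assigned_loci)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {locus: 100.0 * n / total for locus, n in counts.items()}


# Hypothetical sample: one locus assignment per sequenced cDNA clone.
example_sample = ["ERVK-14"] * 46 + ["ERVK-6"] * 3 + ["ERVK-5"] * 1
frequencies = relative_cloning_frequencies(example_sample)
print(frequencies)
```

Because the frequencies within a sample sum to 100%, they describe the relative contribution of each locus and not absolute transcript levels.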

Transcription profiles of HML-2 loci were different for each 
investigated sample (fig. 3). None of the transcribed loci was 
found active in all samples. Nine of the HML-2 loci were de- 
tected exclusively in one of the melanoma samples, and all 
these loci but the ones in 1p31.1 (ERVK-1) and 12q14.1 
(ERVK-21) seem to be transcribed at relatively low levels. 
Loci displaying high relative cloning frequencies, for example, 
ERVK-14 in 7q22.1, ERVK-7 in 1q22, or ERVK-5 in 3q12.3, 
were detected in melanoma samples and in melanocytes. 
Exceptions were the aforementioned ERVK-1 locus in
1p31.1, which was found in only one of the melanoma samples
with a relative cloning frequency of 29%, and the ERVK-6
locus in 7p22.1, which showed relative cloning frequencies of up to
75% in 7 of the melanoma samples but was not
found transcribed in either melanocyte cell line.

Taken together, up to 12 different HML-2 loci were found
to be transcribed in melanoma and melanocytes, and overall
transcription patterns differed significantly between the various
samples. Some loci, among them ERVK-6 in 7p22.1, were
found transcribed exclusively in melanoma.

Provirus Structure and Coding Capacity of HML-2 Loci 
Transcribed in Melanoma and Related Samples 

Regarding the proviral structure of transcribed HML-2 loci,
most of them consist of two LTRs and internal proviral
sequences (table 2). Fifteen loci have intact 5'- and 3'-LTRs;
eight loci harbor internal deletions, and/or at least one LTR is
missing. Seventeen loci exhibit internal sequences of >7,000
bp. ERVK-15 harbors the shortest internal sequence of
approximately 3,900 bp.

As for coding capacity of transcribed HML-2 loci for clini- 
cally relevant proteins, potential ORFs in gag and env genes 
were predicted (table 2). Six loci harbor an ORF coding for a
Gag protein of the expected size of 667 (±1) aa. Gag-encoding
loci were found transcribed in melanoma samples and in
melanocytes with cloning frequencies ranging from <5% for
the ERVK-24 locus in 22q11.21 up to 75% for the ERVK-6
locus in 7p22.1. Three HML-2 loci encoding an env ORF of the
expected size of 700 (±1) aa (ERVK-6 in 7p22.1, ERVK-21 in
12q14.1, ERVK-28 in 19q12) were only found transcribed in
melanoma. HML-2 loci ERVK-21 and ERVK-28 were transcribed
at relatively low frequencies of 14% and 1% in one
melanoma RNA sample and one metastasis, respectively, 
whereas the ERVK-6 locus showed higher cloning frequencies 
of up to 75%. The two loci encoding both Gag and Env pro- 
teins of expected sizes, namely ERVK-28 in 19q12 and the 
ERVK-6 locus in 7p22.1, were both found transcribed exclu- 
sively in melanoma. Some more loci show ORFs for shorter or 
longer potential Gag and Env proteins, but it is difficult to 
predict whether those loci are translated into proteins at all. 

HML-2 proviruses encode two other proteins with interesting
biological features, namely Rec and Np9. Rec is encoded by
so-called HML-2 type II proviruses and Np9 by type I proviruses;
the two types differ by a 292-bp sequence at the pol-env
boundary, which type I loci lack. Out of the
10 transcribed type I loci, eight show an intact ORF for the Np9
protein (unpublished data). Out of the 10 transcribed type II
loci, eight show ORFs for Rec protein (Mayer et al. 2004).
Three other HML-2 loci lack the type I/II discriminating region.

In summary, most HML-2 loci transcribed in melanoma 
feature both LTRs and close to complete internal proviral se- 
quences. Six and three, respectively, of those HML-2 loci show 
ORFs coding for Gag and/or Env proteins. Two loci encoding 
both Gag and Env were exclusively found transcribed in mel- 
anoma. Eight loci each encoding Rec or Np9 proteins were 
found transcribed in both melanoma and melanocytes. 

Identification of Rec and Np9 mRNA Producing 
HML-2 Loci 

We specifically examined, in an additional analysis, transcription
of HML-2 loci producing rec or np9 mRNA. To do so, we
performed RT-PCR on total RNA isolated from melanoma
cell lines SK-Mel-25, SK-Mel-28, MEWO, and WM3734a
and from melanocyte cell lines Benno and Oskar. The
RT-PCR amplicon was based on previously reported rec and
np9 exonic regions (Lower et al. 1995; Armbruester et al.
2004) and was improved regarding unambiguous assignment
of generated cDNA sequences to HML-2 loci compared with a
previous study (Mayer et al. 2004). The 5'-end of the forward
primer was located right on the rec and np9 start codon. The
primer sequence was as reported in Armbruester et al. (2004),
but additional primer variants considering sequence variations
between HML-2 loci were included in PCR amplifications (see
Materials and Methods). The reverse primer was located
approximately 180 nt downstream from the reported np9 and
rec stop codons. PCR products were approximately 580 bp
and approximately 370 bp in size for rec and np9 mRNA-derived
cDNA, respectively.

Fig. 3. Transcription profiles of HERV-K(HML-2) loci in melanoma. HML-2 gag cDNA sequences were amplified from melanoma cell lines SK-Mel-25,
SK-Mel-28, MEWO, and WM3734a, three melanoma RNA samples, two lymph node metastases, and melanocyte cell lines Benno and Oskar. HML-2 loci are
indicated on the x axis by chromosomal band and, if available, HGNC approved names. Samples are indicated on the y axis. l. n., lymph node; mel.,
melanoma. Given on the z axis are relative cloning frequencies as percentages of cDNA sequences that could be assigned unambiguously to particular HML-2
loci, approximately corresponding to transcription levels of those loci in the particular sample. See table 1 for further details.

We generated a total of 91 cDNA sequences from the six
cell lines. We could unambiguously assign 84 cDNA sequences
to genomic HML-2 loci. Four type II and three type I HML-2 loci
were identified as producing rec or np9 transcript in the
investigated samples (table 3). The Rec encoding ERVK-6 locus
in 7p22.1 showed highest relative cloning frequencies of up to
88% in all cell lines but Benno. Although 7p22.1 gag-derived
cDNAs were exclusively found in melanoma, rec transcript was
also found in both Benno and Oskar melanocyte cell lines. Np9
transcript from loci 1q22 (ERVK-7), 3q12.3 (ERVK-5), and
22q11.21 (ERVK-24) was found in both the melanoma and
melanocyte cell lines.

More than 90% of the generated rec and np9 cDNA sequences
showed ORFs encoding proteins of expected sizes of
104 aa or 105 aa for Rec and 75 aa for Np9. The encoded
proteins showed no or only a few amino acid changes compared
with reported Rec and Np9 protein sequences (UniProt
accession no. Q69383, P61571-61576, P61578-61583). One
rec cDNA sequence mapping to 7p22.1 (ERVK-6,
HERV-K(HML-2.HOM)), generated from SK-Mel-28, showed
a 1-bp deletion in the 5'-region of the rec ORF resulting in a
frameshift. Two more sequences mapping to 7p22.1,
obtained from cell line WM3734a, showed a nonsense mutation
resulting in a premature stop codon. However, we interpret
those three sequences as mutant sequences generated in
vitro during cDNA generation or PCR.

Table 2

Characteristics of HERV-K(HML-2) Loci Identified as Transcribed in This Study

The genomic localization of the complete HML-2 locus with LTRs, if present, is given in the first five columns (assembly March 2006 NCBI36/hg18). The structure column
shows an alignment of the complete loci compared with the HERV-K(HML-2.HOM) reference sequence (GenBank accession no. AF074086.2). Non-HML-2 repeat insertions in
the loci in 11q12.3 and 19q12 were deleted before alignment (see later). Gray boxes represent the HERV sequence with black marks representing differences to the
reference. Indel positions are indicated by horizontal lines. Presence (+) or absence (-) of 5'- and 3'-LTRs is given, as well as deletions (Δ) in the LTR sequences, compared with
the human LTR5 consensus sequence as provided by RepBase (http://www.girinst.org/repbase/update/browse.php, last accessed January 31, 2013). ORFs for gag and env were
predicted using Geneious software (Biomatters Ltd.). A complete ORF coding for proteins of 666 aa (±1) for Gag or 699 aa (±1) for Env is indicated by "+". Complete
deletion of the coding region is marked accordingly. The last column gives the classification as type I or II, based on a 292 bp deletion within the pol-env boundary in type I
proviruses (see text). Larger truncations in that region allowing no discrimination are marked accordingly. ORFs for accessory proteins Np9 or Rec, encoded by type I or II
proviruses, respectively, are given (see also Mayer et al. 2004).

The locus in 3p25.3 was only found after irradiation with UVB (see Results for details).

The 3'-LTR of the 7q21.1 and 7q34 locus is annotated on the opposite strand.

The 5'-LTR is interrupted by AluY/AluYb8/AluY between nucleotides 891 and 945, the internal sequence shows an insertion of AluSg between 5,717 and 5,718, and
(TA)n/AluY/(CAG)n, as well as an additional 63-bp fragment of LTR5 after bp 5,998, all annotated on the opposite strand.

The internal HERV sequence is interrupted by an AluYa5 element between nucleotides 4,583 and 4,584.

Taken together, rec and np9 mRNA from, in total, seven
HML-2 loci were found transcribed in both melanoma and
melanocytes. Rec and Np9 protein encoding loci were thus found
transcribed in both tumor and normal samples.

Table 3

Relative Cloning Frequencies of Type I and Type II HML-2 Loci Identified as Transcribed in This Study

Columns: Type; HML-2 Locus (Name, Chr, Start, End, Band); Melanoma Cell Lines SK-Mel-25, SK-Mel-28, MEWO, and WM3734a (# and % each); Melanocytes Benno and Oskar (# and % each).

Type  Name      Chr  Start        End          Band
Np9   ERVK-7    1    153,863,081  153,872,260  1q22
Np9   ERVK-5    3    102,893,427  102,902,549  3q12.3
Np9   ERVK-24   22   17,306,187   17,315,361   22q11.21
Rec   -         2    187,093,878  187,095,344  2q32.1
Rec   ERVK-4    3    127,091,992  127,101,129  3q21.2
Rec   ERVK-6    7    4,588,583    4,606,557    7p22.1
Rec   ERVK-17   10   101,570,559  101,577,735  10q24.2

Entries for number of sequences (#) and relative cloning frequency (%), by type. Np9 loci: 6.25; 5.88; 6.67 and 6.67; 3 (30.00), 2 (20.00), and 1 (10.00). Rec loci: 1 (10.00); 16 (94.12); 14 (100); 13 (86.67); 2 (12.50) and 13 (81.25); 10.00 and 20.00; 12 (100).

We specifically investigated transcription of type I and II HML-2 loci coding for accessory proteins Np9 and Rec, respectively, performing Np9/Rec-specific RT-PCR on total
RNA isolated from melanoma and melanocyte cell lines. Given in the first column are the official locus names according to Mayer et al. (2011) and the chromosome positions
of the different HML-2 loci found as transcribed in the investigated samples. Given are the absolute number of sequences (#) and the relative cloning frequencies (%) for the
different loci in each sample (number of sequences unambiguously assigned to a specific locus divided by the total number of sequences generated from the particular
sample).



Identification of Reverse-Transcribed HML-2 Transcripts

Of further note, a significantly mutated HML-2 type II locus in
2q32.1, with less than 500 bp of internal sequence, was found
to be transcribed in the Benno melanocyte cell line. The locus
consists of three short internal portions, representing
approximately 100 bp upstream of the env splice donor site
immediately downstream from the 5'-LTR, approximately 280 bp
from between the env splice acceptor site and the rec splice
donor site, and approximately 95 bp downstream from the
rec/np9 splice acceptor site, with all splice sites being
incomplete (fig. 4a). The locus harbors approximately 90 bp from the
5'-LTR's 3'-end and a 3'-LTR lacking the 3'-most approximately
150 bp. Nevertheless, both primer regions are present in that
locus. More detailed sequence comparisons indicate that this
particular locus is a retrotransposed HML-2 mRNA that was
spliced similar to rec mRNA except for a short nucleotide
stretch. It displays hallmarks typical for reverse transcription
of mRNA by L1 machinery. The locus is flanked at the 3'-end
by an 8-nt-long poly-A stretch located approximately
90 bp downstream from the HML-2 3'-LTR's poly-A signal,
and it is encompassed by a target site duplication (5': AATC
TGAATTCTT; 3': AATCTGATTTCTT) that displays similarities to
the reported L1 target site consensus sequence (Jurka 1997;
Ostertag and Kazazian 2001). Notably, this locus was not
annotated as a processed pseudogene in pseudogene-related
annotation tracks at the UCSC Genome Browser. The locus was
very likely formed before the split of Hominoidea. Primate
genome comparisons at the UCSC Genome Browser (Kent
et al. 2002) indicate that nearly identical genomic regions
including the HML-2 locus are present in the genomes of
chimpanzee, gorilla, orangutan, and gibbon, but the HML-2 locus
is missing in rhesus and marmoset (fig. 4b). The latter species
belongs to the New World monkeys, which entirely lack
HERV-K(HML-2) sequences. A potential ORF of 408 bp (136
aa) was predicted for that retrotransposed HML-2 locus. The
N-terminal 60 aa of the resulting hypothetical protein show
90% sequence identity (93.33% similarity) to previously
described Rec and Env protein sequences (UniProt accession no.
Q69383, Q69384), which are identical with each other in the
N-terminal 85 aa. The central and C-terminal portions lack
significant similarity to Rec or Env proteins.
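
The figures quoted for the predicted ORF can be checked with simple arithmetic. The sketch below is illustrative only: the counts of 54 identical and 56 similar residues out of 60 are back-calculated from the reported percentages and are not stated in the text, and the helper names are hypothetical.

```python
def codon_count(orf_length_bp: int) -> int:
    """Number of codons in an ORF whose length is a multiple of three."""
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_length_bp // 3


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of identical positions between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# 408 bp correspond to 136 codons, matching the reported 136 aa.
orf_codons = codon_count(408)

# Assumed residue counts (back-calculated): 54 identical and 56 similar of 60.
identity_percent = round(100.0 * 54 / 60, 2)
similarity_percent = round(100.0 * 56 / 60, 2)

# Toy example for the position-wise comparison (not real protein data).
toy_identity = percent_identity("ACGT", "ACGA")
```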

Two cDNA sequences generated from melanocytes Benno
mapped to an HML-2 type II locus in 10q24.2 that carries a
1,372 bp deletion within the env gene (fig. 5a). The locus was
formed within an intron of the ABCC2 gene that is transcribed
in antisense orientation. Notably, the 10q24.2 locus lacks the
intron portion of the rec/np9 splice acceptor site. The 5'-end of
the missing env portion resembles a splice donor site with only
the intron portion missing. It appears that this locus resembles 
on the DNA level another or a partial splice variant of HML-2 
transcript. This locus was likely formed recently in human evo- 
lution as it is only present in the human genome but missing in 
the corresponding genome region of non-human primates 
(fig. 5b). Therefore, we speculate that this locus represents a 
(partially) spliced HML-2 transcript that was reverse tran- 
scribed into a DNA copy in a retroviral fashion after the evo- 
lutionary split of human from chimpanzee. Furthermore, a 
transcript is generated from that locus, and an ORF for a hy- 
pothetically larger protein of 214 aa is encoded that is identical 
to HML-2 Env protein for the N-terminal 200 aa (and thus to 
Rec for aa 1-85) and to Rec protein for the C-terminal 14 aa. 
In other words, the encoded protein is a chimera of Env and 
Rec protein portions. 

One type II cDNA sequence generated from RNA from
melanocytes Benno matched best to an HML-2 locus in 3q12.3
(ERVK-5) but nevertheless displayed a relatively high number
of six mismatches to that locus. Sequences showing the same
characteristic nucleotide differences when compared with
ERVK-5 were also found in melanoma-unrelated samples
investigated by us in a separate study. We were able to isolate
the identical sequence from corresponding genomic DNA.
Therefore, some isolated cDNA sequences point to a hitherto
unidentified allelic variant of an existing HERV-K(HML-2) locus
or, alternatively, to an as yet unidentified HML-2 locus (details to
be published elsewhere).

Fig. 4. HERV-K(HML-2) locus in 2q32.1 transcribed in melanocyte cell line Benno shows typical features of retrotransposition by L1 machinery. (a)
Depicted in the center-right part of the figure is a dot matrix comparison (window size: 30; minimum score: 50%; jump: 1) of the 2q32.1 locus with the
HERV-K(HML-2.HOM) proviral locus (see text) as a reference. Locations of retroviral LTRs, gag, pro, pol, and env genes, and splice donor (SD) and splice
acceptor (SA) sites are indicated for the reference sequence next to the y axis. Subregions in the 2q32.1 locus (chr2 locus), as indicated in the dot matrix
comparison, were compared in more detail with a cDNA assigned to this locus (cDNA), a previously reported rec mRNA sequence (Rec) (see text), and the
sequence of HERV-K(HML-2.HOM) (HOM). Numbers above the alignment indicate nucleotide positions with respect to the HOM sequence. More detailed
sequence comparison shows that the locus represents a retrotransposed HML-2 mRNA that was spliced similar to rec mRNA, with splice donor and acceptor
sites resembling the ones in rec mRNA (ii-v). The 2q32.1 locus is flanked by target site duplications (TSDs) and a poly(A)-signal in the 3'-end (i, vi, and vii). The
position of the forward primer used for RT-PCR is indicated in (iii). (b) Graphical depiction of the HERV-K(HML-2) locus in human chromosome 2q32.1 as
provided by UCSC Genome Browser (Kent et al. 2002). The retrotransposed HML-2 sequence was detected and annotated by Repeatmasker v.3.2.7,
indicated by a black horizontal bar. The same genome region encompassing the HML-2 locus is present in other primate genomes as indicated by greenish or
black-colored horizontal boxes. Rhesus and marmoset genomes harbor the same genome region but lack exactly the HML-2 portion as indicated by thin
horizontal lines. Panel (b) was compiled from annotation tracks provided in hg18 and hg19 (http://genome.ucsc.edu, last accessed January 31, 2013).

Fig. 5. An HML-2 type II locus in 10q24.2, transcribed in melanocyte cell line Benno, shows a 1,372 bp deletion in env. (a) Depicted in the center-right
part of the figure is a dot matrix comparison of the ERVK-17 locus in 10q24.2 with HERV-K(HML-2.HOM) and detailed alignments of indicated splice sites of
the 10q24.2 genomic sequence (chr10 locus) with two cDNAs assigned to this locus (cDNA1, cDNA2), a rec mRNA sequence (Rec), and the
HERV-K(HML-2.HOM) sequence (HOM). Numbers above the alignment indicate nucleotide positions in HOM (see also the legend of fig. 4). The position
of the forward primer used for RT-PCR is indicated in (i). The 10q24.2 locus lacks the intron portion of the rec splice acceptor site (iv). The 5'-end of the
missing env portion resembles an alternative splice donor site with the corresponding intron portion missing (iii) approximately 330 bp downstream of the
regular rec splice donor site (ii). The locus appears to resemble, on the DNA level, another splice variant of HML-2 transcript, and it might represent a spliced
HML-2 transcript that was reverse transcribed into a DNA copy in a retroviral fashion. (b) The reverse-transcribed locus is present in the human genome but
missing in the genomes of non-human primates. See legend of figure 4 for further details.

Taken together, we identified two transcribed HML-2 loci
that very likely were generated by reverse transcription of
spliced HML-2 transcript either by L1 machinery or in a
retroviral fashion. The HML-2 locus on chromosome 2q32.1 may
encode a protein that is half Rec-like, and the chromosome 10q24.2
locus ERVK-17 may encode a chimeric Env/Rec protein. We
also obtained evidence for a novel HML-2 type II-like allele that
displays several nucleotide differences within the Rec coding
sequence.

Polymorphic Loci Do Not Account for Differences in 
Transcription Profiles 

Several HERV-K(HML-2) loci have been described as polymorphic
in the human population (Barbulescu et al. 1999;
Hughes and Coffin 2004; Macfarlane and Simmonds 2004;
Mayer et al. 2005). Of those polymorphic loci, the ERVK-6
locus in 7p22.1 (HERV-K(HML-2.HOM)) and four loci located
in 1p31.1 (ERVK-1), 3p25.3 (ERVK-2), 3q21.2 (ERVK-3), and
6q14.1 (ERVK-9) were identified as transcribed in one or
several of the investigated samples. As a lack of cDNA, and thus
transcript, from one or several of those loci might be explained by
absence of the particular locus in a sample's genomic DNA,
we tested the genomic DNA of the available cell lines for
presence of those HML-2 loci. We furthermore included in
this analysis the HERV-K113 locus in 19p12 and the
HERV-K115 locus in 8p23.1, which were not found transcribed
in our samples. Polymorphic HML-2 loci can consist
of a complete provirus, a solitary LTR resulting from homologous
recombination between the 5'- and 3'-LTR, or an empty
site if the particular HML-2 insertion is not fixed in the
population. Using specific PCR primers flanking an HML-2 locus and
a flanking primer plus a primer within the LTR, sizes of the
generated PCR products were indicative of the different
alleles.

Fig. 6. Quantitation of HERV-K(HML-2) gag transcription by qRT-PCR. (a) Relative HML-2 gag transcript levels in melanoma cell lines SK-Mel-25,
SK-Mel-28, MEWO, and WM3734a, three melanoma RNA samples, a melanoma lymph node metastasis (LM), and melanocyte cell lines Benno and Oskar, as
well as GCT cell lines NCCIT and Tera-1. HML-2 transcript levels were normalized to transcript levels of G6PDH and RPII housekeeping genes. Gag transcript
level in melanocyte cell line Benno before UV treatment was taken as reference. Black bars indicate minimum and maximum relative levels of gene
expression as calculated by StepOne software (see Materials and Methods). Note the different scale for NCCIT and Tera-1 cells. (b) Relative HML-2 gag
transcription in melanoma and melanocyte cell lines before (dark gray bars) and 24 h after irradiation (light gray bars) with 200 mJ/cm² UVB.

For most examined polymorphic HML-2 loci, an approximately
10-kb-long PCR product could be generated in most of
the samples, indicating presence of at least one copy of the
particular HML-2 loci in those samples (supplementary fig. S3,
Supplementary Material online). The HERV-K113 locus could
not be amplified from any of the samples, and the HERV-K115
locus (at least one copy) was identified only in melanocyte cell
line Benno. This finding is not unexpected in the light of
previously reported low allele frequencies for those two loci
(Burmeister et al. 2004; Macfarlane and Simmonds 2004;
Moyes et al. 2005).

Therefore, presence of at least one copy of most polymorphic
proviruses in nearly all the samples argues against
differences in HML-2 transcription profiles being due to lack of the
respective HML-2 loci in the investigated samples.
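
The size-based allele calling described in this section can be sketched as a simple lookup. This is a hedged illustration, not the procedure used in the study: only the approximately 10-kb provirus product is taken from the text, whereas the expected sizes for the empty site and the solitary LTR, the tolerance, and all names are hypothetical placeholders that would depend on the actual primer positions.

```python
from typing import Dict, Optional

# Expected product sizes in bp for the flanking primer pair.
# Only the ~10 kb provirus product is from the text; the other values are assumed.
EXPECTED_SIZES = {
    "empty site": 300,
    "solitary LTR": 1300,
    "provirus": 10000,
}


def classify_allele(product_bp: int,
                    expected: Dict[str, int] = EXPECTED_SIZES,
                    tolerance: float = 0.15) -> Optional[str]:
    """Return the allele whose expected size is closest to the observed product.

    Only alleles whose expected size deviates by at most `tolerance`
    (as a fraction of the expected size) are considered; None is returned
    if no allele fits.
    """
    best_allele = None
    best_deviation = None
    for allele, size in expected.items():
        deviation = abs(product_bp - size) / size
        if deviation <= tolerance and (best_deviation is None or deviation < best_deviation):
            best_allele, best_deviation = allele, deviation
    return best_allele
```

A sample can carry two different alleles of the same locus, so in practice each observed band would be classified separately.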

Quantitation of HML-2 Transcription and HML-2 
Transcription Profiles Following UVB Irradiation 

We quantified overall HML-2 gag transcription levels in the
investigated melanoma and melanocyte samples and in GCT
cell lines NCCIT and Tera-1, which both display upregulated
HML-2 transcription (Florl et al. 1999; Ruprecht et al. 2008), by
quantitative real-time RT-PCR. HML-2 transcript levels were
normalized to transcript levels of G6PDH and RPII housekeeping
genes, and HML-2 transcript level in melanocyte cell line
Benno before UV treatment was taken as reference.
Melanocyte cell line Oskar and melanoma cell lines SK-Mel-25,
SK-Mel-28, and MEWO showed HML-2 transcription levels
similar to that of melanocyte cell line Benno (fig. 6a).
WM3734a cells, the three melanoma RNA samples, and the
lymph node metastasis showed HML-2 transcript levels 5-23
times higher than Benno. Still, HML-2 transcript levels in GCT
cell lines NCCIT and Tera-1 were 733-fold and 77-fold higher
compared with Benno.
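
For orientation, relative quantification against two housekeeping genes and a reference sample can be sketched with the widely used comparative Ct (2^-ΔΔCt) approach. This is an assumption for illustration: the text only states that relative levels were calculated by StepOne software, so the exact algorithm, the assumed amplification efficiency of 100%, and all Ct values below are hypothetical.

```python
from statistics import mean
from typing import Sequence


def relative_expression(ct_target: float,
                        ct_references: Sequence[float],
                        ct_target_calibrator: float,
                        ct_references_calibrator: Sequence[float]) -> float:
    """Fold change of a target transcript relative to a calibrator sample.

    Ct values of the reference genes are averaged (equivalent to the
    geometric mean of their quantities), and a doubling per cycle is assumed.
    """
    delta_sample = ct_target - mean(ct_references)
    delta_calibrator = ct_target_calibrator - mean(ct_references_calibrator)
    delta_delta = delta_sample - delta_calibrator
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values: HML-2 gag versus two housekeeping genes.
fold_change = relative_expression(
    ct_target=22.0, ct_references=[18.0, 20.0],                      # sample of interest
    ct_target_calibrator=25.0, ct_references_calibrator=[18.0, 20.0],  # calibrator sample
)
print(fold_change)
```

In this scheme the calibrator sample has a relative level of 1 by definition, which corresponds to using untreated Benno cells as the reference.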

Previous studies reported deregulation of HML-2 transcription
in melanoma cell lines following UVB irradiation (Schanab
et al. 2011). Such deregulation may be caused by deregulation
of transcription of particular HML-2 loci or by general
deregulation of all transcribed HML-2 loci. We therefore
investigated changes in transcription profiles of HML-2 loci
following UV irradiation. Following the experimental strategy
described in Schanab et al. (2011), melanoma and melanocyte
cell lines were irradiated with 200 mJ/cm² of UVB, and total
RNA was isolated from irradiated cells 24 h later. HML-2
transcription profiles were investigated as described earlier. In
total, 528 sequences were generated, of which 49 (9.28%)
showed more than one best match in BLAT searches and were
thus excluded from further analysis. As mentioned earlier, the
HERV-K(HML-2.HOM) tandem provirus was regarded as one
locus. The remaining 479 (90.72%) sequences could be
unambiguously assigned to a genomic HML-2 locus with an
average mismatch count of 2.27. Four hundred thirty-six
(91.02%) of those sequences could be assigned
unambiguously to a genomic locus with six or fewer
mismatches (fig. 2b) and were analyzed further.

Fig. 7. Influence of UV irradiation on HERV-K(HML-2) transcription. Melanoma cell lines SK-Mel-25, SK-Mel-28, MEWO, and WM3734a and
melanocyte cell lines Benno and Oskar were irradiated with 200 mJ/cm² UVB (see text). Given are relative cloning frequencies as percentages of gag-derived cDNA
sequences that could be unambiguously assigned to particular HML-2 loci before (dark gray bars) and 24 h after (light gray bars) UVB irradiation of each cell
line.

Overall HML-2 gag transcript levels were also quantified in
UV-irradiated cell lines 24 h after exposure to 200 mJ/cm²
UVB. Each of the cell lines showed a considerable decrease
in relative HML-2 expression compared with nonirradiated 
cells (fig. 6b). The strongest effect was observed in melanocyte 
cell line Benno and melanoma cell line WM3734a, which 
showed highest relative HML-2 transcript levels before treat- 
ment. In both cell lines, relative HML-2 transcript levels were 
reduced to approximately 14% of the transcription in nonir- 
radiated cells. 

UV irradiation modified transcription profiles of HML-2 loci
(fig. 7). Except for SK-Mel-25, the total number of transcribed
loci slightly increased after irradiation. On average, 10 loci
were transcribed after irradiation, in contrast to seven loci in
untreated cells, with no specific activation of particular loci.
The relative cloning frequency of cDNA (thus relative transcription
level) of the often dominantly transcribed ERVK-14 locus
in 7q22.1, which showed frequencies up to 92% before treatment,
was reduced to a frequency of, at most, 39% following
irradiation. The ERVK-2 locus in 3p25.3 was found transcribed
in UV-irradiated SK-Mel-28 and MEWO cells but had not been
found transcribed in untreated cells (table 2). Of further note,
melanoma and melanocyte cell lines did not display characteristic
differences in changing transcription profiles.
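
A comparison of transcription profiles before and after irradiation can be sketched as follows. The two profiles are hypothetical and serve only to show the bookkeeping (loci newly detected, loci no longer detected, and changes in relative cloning frequency); they are not data from any cell line in this study, and the function name is an assumption.

```python
from typing import Dict, List, Tuple


def compare_profiles(before: Dict[str, float],
                     after: Dict[str, float]) -> Tuple[List[str], List[str], Dict[str, float]]:
    """Compare two profiles of relative cloning frequencies (percent per locus).

    Returns the loci detected only after treatment, the loci detected only
    before treatment, and the change in frequency for loci found in both.
    """
    gained = sorted(set(after) - set(before))
    lost = sorted(set(before) - set(after))
    changes = {locus: after[locus] - before[locus]
               for locus in set(before) & set(after)}
    return gained, lost, changes


# Hypothetical profiles for one cell line (percent of assigned cDNA sequences).
untreated = {"ERVK-14": 92.0, "ERVK-5": 8.0}
irradiated = {"ERVK-14": 39.0, "ERVK-5": 21.0, "ERVK-2": 40.0}

gained, lost, changes = compare_profiles(untreated, irradiated)
print(gained, lost, changes)
```

Because cloning frequencies are relative, a drop for one locus necessarily raises the share of the others, so such a comparison should be read together with the absolute transcript levels measured by qRT-PCR.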

Taken together, all melanoma samples and a melanoma- 
derived cell line show higher HML-2 gag transcript levels com- 
pared with melanocytes. UVB irradiation decreased HML-2 
transcript levels in all irradiated melanoma and melanocyte 
cell lines. Transcription levels of several HML-2 loci appear to 
be altered by UV irradiation in melanoma and melanocyte cell 
lines, with nonuniform responses. 

Discussion 

Several previous studies suggested an involvement of 
HERV-K(HML-2) in the etiology of melanoma. If HML-2 se- 
quences turn out to be of relevance in the development of 
melanoma, it will certainly be of interest to identify the tran- 
scribed HML-2 loci, which actually account for HML-2 RNA 
and proteins detected in previous studies (Buscher et al. 2005, 
2006; Reiche et al. 2010; Stengel et al. 2010). We therefore 
identified in this study for the first time transcribed HML-2 loci 
in melanoma and biologically related samples. We were thus 
able to address whether transcription of certain HML-2 loci 
distinguishes melanoma from healthy samples and whether 
previously reported increased HML-2 expression in melanoma 
is due to specific activation of certain HML-2 loci or a more or 
less even upregulation of several loci. 

On the basis of a previous study (Flockerzi et al. 2008), we 
improved the HML-2 cDNA amplification strategy by increas- 
ing the number of amplifiable loci using mixes of primer var- 
iants considering nucleotide differences in the primer binding 
regions of HML-2 loci. In contrast to Flockerzi et al. (2008), 
who identified transcribed HML-2 loci based on amplicons in
the gag and env gene regions, we restricted our analysis to an
amplicon in the gag gene. All loci detected as gag- or env-derived
cDNAs in the previous study can also be detected by
the gag primer variants used in this study. Furthermore, all 
HML-2 loci with a perfect match to the env primers used in 
the previous study are also recognized by primer variants in 
this study, with the exception of one locus harboring a dele- 
tion in gag (but not described as transcribed so far). A second 
amplicon located in env would thus not have increased the 
number of amplifiable loci but instead would have only 
resulted in twice the amount of work and expense. 

Generated HML-2 cDNA sequences were assigned to genomic
loci based on characteristic (private) nucleotide differences
distinguishing the various HML-2 loci in the human
genome from each other. As many of the HML-2 loci represent
evolutionarily young HERV loci, the number of random
mutations accumulated over time is lower than for other, 
(much) older HERV groups. We therefore designed the PCR
product amplified from cDNA to be >600 bp to ensure as
much as possible that cDNA sequences generated by Sanger 
sequencing harbor sufficient numbers of private nucleotide 
differences to allow reliable assignment of cDNA sequences 
to specific loci. Although a next-generation sequencing-based 
approach would have allowed generation of much higher 
numbers of cDNA sequence reads, maximum read lengths 
would have been rather short, which in the case of HML-2 
likely complicates reliable cDNA assignment or makes it im- 
possible for many HML-2 loci. For a few HML-2 loci, there was 
only one nucleotide along the investigated 620 bp distinguish- 
ing one from the other locus (fig. ^b). Even for this study's 
much longer Sanger sequencing reads, cDNA assignment was 
sometimes complicated by sequence differences introduced 
by RT-PCR/sequencing errors, SNPs, or artifactual cDNA
sequences due to recombination of transcripts from different
HML-2 loci during cDNA generation (Flockerzi et al. 2007) 
resulting in mismatches of cDNA sequences to the genomic 
reference. However, the best BLAT match to one particular 
HML-2 locus is expected to be correct, if matches to other 
HML-2 loci display more differences. We excluded cDNA se- 
quences if unambiguous assignment to a particular HML-2 
locus was not possible. All in all, fewer than 3% and 9% of
the sequences generated from samples before and after UV
irradiation, respectively, had to be omitted due to >6
mismatches or ambiguous assignments.
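
The assignment rule described above can be illustrated with a minimal sketch. It is not the pipeline used in this study, which relied on BLAT matches against the genomic reference. The sketch assumes that each cDNA sequence has already been aligned to every candidate locus over the same region, so that mismatches can be counted position by position. The function names and the toy sequences are hypothetical; only the threshold of six mismatches and the exclusion of ambiguous assignments are taken from the text.

```python
from typing import Dict, Optional

# Threshold taken from the text: sequences with >6 mismatches were omitted.
MAX_MISMATCHES = 6


def count_mismatches(read: str, reference: str) -> int:
    """Count differing positions between two pre-aligned, equal-length sequences."""
    if len(read) != len(reference):
        raise ValueError("sequences must be pre-aligned to equal length")
    return sum(1 for a, b in zip(read.upper(), reference.upper()) if a != b)


def assign_locus(read: str, loci: Dict[str, str]) -> Optional[str]:
    """Return the single best-matching locus, or None if the read is excluded.

    A read is excluded when its best match exceeds MAX_MISMATCHES or when
    two loci tie for the best match (ambiguous assignment).
    """
    scores = sorted((count_mismatches(read, seq), name) for name, seq in loci.items())
    if not scores:
        return None
    best_mismatches, best_name = scores[0]
    if best_mismatches > MAX_MISMATCHES:
        return None  # too divergent from every locus
    if len(scores) > 1 and scores[1][0] == best_mismatches:
        return None  # tie between loci: assignment is ambiguous
    return best_name


# Hypothetical toy loci differing by a single private nucleotide.
toy_loci = {"locus_A": "ACGTACGTAC", "locus_B": "ACGTACGTAG"}
assigned = assign_locus("ACGTACGTAC", toy_loci)
```

With these toy loci, a read identical to locus_A is assigned to it, whereas a read ending in T differs from both loci by one position and is excluded as ambiguous.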

Our strategy for cDNA generation by random priming 
explicitly does not differentiate between sense and antisense 
transcripts of HML-2 loci. This may be of importance for 
HML-2 loci lacking the 5'-LTR but having a 3'-LTR. Because 
the HML-2 LTR can exert bidirectional promoter activity 
(Domansky et al. 2000), such mutated HML-2 loci may be 
transcribed by the 3'-LTR in antisense. For HML-2 loci lacking
both LTRs transcripts may stem from upstream or down- 
stream promoters and the HML-2 loci may therefore be tran- 
scribed in sense or antisense, respectively (see later). However, 
we aim at identifying all the HML-2 loci transcribed in some 
way because antisense transcribed loci may exert some bio- 
logical function. For instance, HML-2 antisense RNA might 
regulate expression of genes by RNAi. Overall, because most
HML-2 loci have a 5'-LTR and HML-2 5'-LTRs usually display
promoter activity, HML-2 loci transcribed only in antisense
very likely constitute a minority (Vinogradova et al. 2001;
Buzdin et al. 2006).

We verified our cDNA amplification strategy by PCR on
genomic DNA. Eighteen different HML-2 loci were identified 
from unambiguous assignment of 55 sequences to genomic 
HML-2 loci. As a total of 39 HML-2 loci show sufficient sequence
similarity to the primer variants and could thus be
amplified in theory, the 18 different cloned HML-2 loci might
initially suggest a somewhat preferential amplification of certain
loci. However, that observed distribution did not correlate
with HML-2 loci subsequently cloned from cDNA. For in- 
stance, HML-2 loci in 3q27.2 (ERVK-11), 19q12 (ERVK-28), 
and 22q11.21 (ERVK-24) showing somewhat higher cloning 
frequencies when amplified from genomic DNA (supplemen- 
tary fig. S2, Supplementary Material online) were identified at 
relatively low frequencies when amplified from cDNA 
(table 1). In subsequent amplifications from cDNA, besides 
13 of the loci amplified from genomic DNA, we identified 
an additional nine loci not amplified from genomic DNA 
showing that, in fact, more than the aforementioned 18 loci 
can be amplified. We wish to mention that 50 more se- 
quences generated from genomic DNA had to be omitted 
from analysis as they showed greater than 6 and up to 26 
mismatches to their best BLAT match. Such high numbers of 
unassignable sequences were not observed for sequences 
generated from cDNA, where >90% of the sequences 
could be unambiguously assigned. We suspect that the high
number of unassignable sequences from genomic DNA is due
to recombinants created during PCR because
of an excess of potential recombination partners present in the
genomic DNA, in contrast to the situation in cDNA,
where far fewer such recombination partners are present.
This observation may be of importance for other studies involving
PCR amplification of high-copy number repetitive elements.
A relatively high number of recombinants may be
produced during PCR in such experiments, and it may be advisable
to verify the desired sequence by sequencing. Also, a
very low amount of template DNA might decrease the
number of recombinations during PCR.

As for the identification of transcribed HML-2 loci, the number
of 23 transcribed HML-2 loci is very similar to that previously
described for GCTs and brain tissues (Flockerzi et al. 2008).
Nineteen loci were detected in both studies. Four loci 
(ERVK-25 to ERVK-28, see table 1) have not been described 
as transcribed before and were therefore added to databases 
of the HGNC that also assigns locus designations to transcribed
HERV loci (Mayer et al. 2011). When comparing our
results with results from a recent microarray-based study 
(Perot et al. 2012), 16 loci identified in our study were also 
reported as transcribed in Perot et al. (2012), whereas seven
loci were not identified in that study; however, melanoma was
not specifically investigated in Perot et al. Overall transcription
profiles of HML-2 loci seem different in melanoma and melanocytes
when compared with those found for GCTs (Flockerzi
et al. 2008). In GCTs, transcription of ERVK-24 in
22q11.21, ERVK-23 in 21q21.1, and ERVK-20 in 11q23.3 in particular
appears specifically upregulated, yet those loci appear to be
transcribed only at low levels in melanoma based on cloning frequencies,
or were not detected as transcribed at all, specifically
ERVK-20. Furthermore, overall HML-2 transcription profiles
appear clearly different for the various samples, that is, each 
sample displayed a particular set of transcribed HML-2 loci
that overlaps only partially with those of other samples. As
for HML-2 loci distinguishing melanoma from melanocytes, 
locus ERVK-6 in 7p22.1 appears as an interesting candidate 
because it was exclusively detected in seven of the melanoma- 
derived samples but not in the melanocyte cell lines. More 
specific and larger scale analysis might reveal the ERVK-6 
locus as a biomarker for melanoma. 
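
The comparison of transcription profiles by relative cloning frequencies, and the search for loci detected in one sample group but not in another, can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code of this study: the function names, sample names, and locus labels are hypothetical placeholders, and the toy input does not reproduce any data reported here. Each sample is represented simply as the list of loci to which its cloned cDNA sequences were assigned.

```python
from collections import Counter
from typing import Dict, List, Set


def relative_cloning_frequencies(assigned: List[str]) -> Dict[str, float]:
    """Fraction of assigned cDNA clones per locus within one sample."""
    counts = Counter(assigned)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {locus: n / total for locus, n in counts.items()}


def loci_exclusive_to(group_a: Dict[str, List[str]],
                      group_b: Dict[str, List[str]]) -> Set[str]:
    """Loci detected in at least one sample of group_a and in no sample of group_b."""
    seen_a = set().union(*(set(clones) for clones in group_a.values()))
    seen_b = set().union(*(set(clones) for clones in group_b.values()))
    return seen_a - seen_b


# Hypothetical toy input: locus assignments per sample (not data from this study).
melanoma = {"mel_1": ["locus_X", "locus_Y", "locus_Y"], "mel_2": ["locus_X"]}
melanocytes = {"mc_1": ["locus_Y"]}

profile_mel_1 = relative_cloning_frequencies(melanoma["mel_1"])
candidates = loci_exclusive_to(melanoma, melanocytes)
```

In this toy input, locus_X is the only locus seen in the first group and absent from the second. Such a result would only nominate a candidate, since cloning frequencies from a limited number of clones are a coarse proxy for transcript levels.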

Observed differences in transcription profiles or lack of
transcripts from particular HML-2 loci are not explained simply
by lack of those loci in investigated samples. When testing for
presence of seven HML-2 loci described to be polymorphic in the
human population (Barbulescu et al. 1999; Hughes and Coffin
2004; Macfarlane and Simmonds 2004; Mayer et al. 2005), 
we found at least one copy of each locus in nearly all the 
investigated cell lines. 

Most of the transcribed HML-2 loci harbor more or less 
full-length internal sequences and 5'- and 3'-LTRs, with the
former harboring the classical proviral promoter. For HML-2 loci
lacking the 5'-LTR, transcription might have been initiated by
bidirectional promoter activity of the 3'-LTR (see earlier).
ERVK-14 in 7q22.1 and ERVK-17 in 10q24.2, both lacking 
the 5'-LTR, are located intronically within genes, thus different
splicing might account for the transcript of the corresponding 
HML-2 locus. ERVK-14 is located intronically and in antisense 
orientation within the LHFPL3 gene. It has been reported that 
the ERVK-14 5'-portion contributes an internal exon for 14
annotated antisense ESTs (Kim and Hahn 2010). ERVK-17 is 
located intronically in antisense orientation in the MRP2 gene, 
a member of the MRP subfamily, mediating transport of an- 
ticancer agents and contributing to drug resistance (Kruh and 
Belinsky 2003). For other HML-2 loci, as yet unidentified promoters
located nearby might have initiated transcription, or read-through
transcription from neighboring genes might have
produced the HML-2 transcript. For example, ERVK-15 in
7q34 is located approximately 1.6 kb upstream of the housekeeping
gene SSBP1 that has been reported as being involved
in mitochondrial biogenesis and as being a subunit of a 
single-stranded DNA-binding complex involved in the mainte- 
nance of genome stability (Tiranti et al. 1995; Huang et al. 
2009). 

Several HML-2 loci identified as transcribed harbor 
full-length ORFs for retroviral Gag and Env proteins. The 
three loci theoretically capable of producing full-length Env 
protein were exclusively found transcribed in melanoma. 
Although ERVK-21 in 12q14.1 and ERVK-28 in 19q12 were 
each found transcribed in only one of the samples, the ERVK-6 
locus in 7p22.1 seems to be of particular interest, as it was 
found transcribed in seven melanoma samples but was not 
found transcribed in melanocytes. A previous study detected 
spliced env mRNA and Env protein in cell lines SK-Mel-28 and
MEWO (Buscher et al. 2005). In both cell lines, our study iden- 
tified the ERVK-6 locus as transcribed, thus previously de- 
tected mRNA and protein might have originated from this 
locus. Moreover, ERVK-6-locus-encoded Env protein might 
account for Env-directed antibodies detected in sera from melanoma
patients (Hahn et al. 2008). For GCTs, there was no
obvious correlation between expression of Env-encoding 
HML-2 loci and presence of antibodies (Flockerzi et al. 
2008). As the ERVK-6 locus also harbors an ORF for a full- 
length Gag protein, ERVK-6 might also account for Gag 
protein detected in melanoma metastases and Gag-directed 
antibodies detected in sera from melanoma patients (Buscher 
et al. 2005; Hahn et al. 2008). In any case, additional studies
investigating more samples will be required to elucidate the 
biological significance of the ERVK-6 locus in melanoma. Two 
more loci harboring gag ORFs, ERVK-9 in 6q14.1 and ERVK-28
in 19q12, are expressed in only one of the melanoma samples 
and are transcribed at very low levels based on cDNA cloning 
frequencies. The remaining three loci with gag ORFs are ex- 
pressed in both melanoma and melanocytes and show no 
striking difference concerning relative cloning frequencies
and thus transcript levels. The same was found for GCTs;
Gag-encoding loci were found transcribed in tumor samples 
from antibody positive and negative patients (Flockerzi et al. 
2008). Generation of antibodies against HML-2 Gag and Env 
protein in melanoma and GCT patients appears as a more 
complex mechanism that is determined not just by activation 
of Gag and/or Env encoding HML-2 loci. 

Previous studies suggested an involvement of HML-2-
encoded accessory proteins Rec and Np9 in tumorigenesis
(Boese et al. 2000; Armbruester et al. 2004; Galli et al.
2005; Denne et al. 2007). We found rec and np9 transcripts
in both melanoma and melanocyte cell lines. A previous study 
by Buscher et al. (2006) detected rec and np9 mRNA in mel- 
anoma as well as rec mRNA in melanocytes, yet Rec and Np9 
proteins were detected in only a small number of tumors. 
Thus, presence of rec and np9 mRNA does not necessarily 
imply presence of corresponding proteins at detectable 
levels. We did not investigate protein expression in our study 
or specifically quantify rec and np9 transcription. However, 
overall Rec and/or Np9 expression in melanocytes may be
lower than in melanoma, or mRNA or proteins may be
degraded more quickly.

A type II locus in 2q32.1 was found transcribed in melano- 
cytes Benno. That locus is of particular interest because it pre- 
sents features of a processed pseudogene generated by L1 
retrotransposition machinery. Retrotransposed elements of 
the HERV-H and HERV-W group have been described before 
(Goodchild et al. 1995; Pavlicek et al. 2002). To the best of our
knowledge, this is the first reported locus of the HERV- 
K(HML-2) group that was formed by retrotransposition and 
that is transcribed despite its truncated structure. Additional 
evidence for this locus being transcribed is provided by EST 
sequence entries AW968573, AA167225, BF921025,
GD144208, and AA491129 that locate just upstream of or 
partially overlap with the locus. Analysis of primate genome 
sequences furthermore indicated that the locus was formed 
just before the evolutionary split of Hominoidea. The HML-2
locus in chromosome 10 likely formed by reverse transcription
and integration in a retroviral fashion of an HML-2 proviral 
transcript from which only an intron within env had been 
removed. That locus was very likely formed relatively recently 
in human evolution as it is not found in non-human primates. 
Moreover, both loci harbor open reading frames for HML-2 
Rec and Env-like proteins. It is currently not known whether 
any of those proteins is expressed, so that they must be 
deemed hypothetical at this point. However, we note that 
according to dbSNP 137, there is only one synonymous SNP 
known for the chromosome 2q32.1 locus (rs2099848), and 
two nonsynonymous SNPs (rs61870457 and rs9633711) and
one synonymous SNP (rs9633710) for the chromosome 
10q24.2 locus. No nonsense sequence
variations are known. Of further note, the retrotransposed
HML-2 rec locus in 2q32.1 was not identified in our initial 
BLAT search for compiling HML-2 sequences from the 
human reference genome nor was it included in a recent 
study that described full-length and partial HML-2 loci in the 
human genome (Subramanian et al. 2011). Even a spliced rec
mRNA sequence as BLAT query did not identify the 2q32.1 
locus. More specific analysis may thus identify other retrotran- 
sposed HML-2 transcripts in the human (or other Old World 
primate) genomes. 

Another type II locus in 10q24.2 (ERVK-17) found tran- 
scribed in melanocytes Benno might represent another 
HML-2 mRNA splice variant that was reverse transcribed and 
reintegrated into the genome yet rather in a retroviral fashion. 
This locus harbors a 1,372 bp deletion within the env gene 
that is similar to the intronic portion spliced from rec mRNA, 
except that a different splice donor site somewhat down- 
stream from the actual rec splice donor site had been used. 
In contrast to a protein hypothetically encoded by the 2q32.1 
locus, which is similar to previously described Rec and Env 
protein sequences only for its N-terminus, the 10q24.2 locus 
hypothetically encodes a chimeric protein consisting of Env 
and Rec protein portions. In the light of the HML-2 Np9 pro- 
tein being an HML-2 protein variant resulting from deletion of 
a 292 bp sequence within the HML-2 pol-env region 
(Armbruester et al. 2002), it is conceivable that those proteins 
are translated and of biological significance. 

Quantitation by qRT-PCR revealed upregulated HML-2 gag 
transcript levels in about half of the melanoma samples. 
Melanoma cell line WM3734a, three melanoma RNA samples, 
and one melanoma lymph node metastasis showed upregu- 
lated HML-2 transcript levels compared with melanocytes 
Benno and Oskar (no qRT-PCR results of sufficient quality 
could be obtained for the second lymph node metastasis for 
unknown reasons, which was thus omitted from analysis). 
Although relative transcript levels in these melanoma 
samples were significantly lower than in GCT cell lines 
NCCIT and Tera-1 that display strongly upregulated HML-2 
transcription (Florl et al. 1999; Ruprecht, Mayer, et al. 
2008), the still markedly elevated HML-2 transcription levels in
melanoma and other cancers might suffice for detectable expression
of HML-2 proteins (Buscher et al. 2005; Reiche et al.
2010).

Genome Biol. Evol. 5(2):307-328. doi:10.1093/gbe/evt010 Advance Access publication January 21, 2013

A previous study reported upregulated levels of HML-2 
RNA and proteins in melanoma cell lines following irradiation 
with UVB (Schanab et al. 2011). Following irradiation of cells
with the same dose of UVB, we observed changing HML-2 
transcription profiles for each of the investigated melanoma 
and melanocyte cell lines. Apart from the ERVK-14 locus in 
7q22.1 dominantly transcribed before UVB treatment but dis- 
playing reduced transcript levels afterward in five of the mel- 
anoma cell lines and both melanocyte cell lines, no specific 
(de)activation of HML-2 loci became obvious, including mela- 
noma cell lines versus melanocytes. 

We note that our quantitation of HML-2 transcript levels 
produced results in conflict with the study by Schanab et al. 
(2011) that reported activation of HML-2 transcription as a 
specific response to UVB in melanoma. That study, employing 
a qRT-PCR amplicon located within pol, reported upregulated 
HML-2 transcription in melanoma and lower transcription in 
nonmelanoma cells. In contrast, we found significantly low- 
ered HML-2 transcription in melanoma and melanocyte cell 
lines. Our results are based on a different qRT-PCR amplicon 
located in gag. However, when comparing the loci recognized 
by our primers located in gag and those recognized by the pol 
primers employed in Schanab et al., both primer pairs should 
be able to amplify about the same HML-2 loci. That is, differ- 
ences in measured transcript levels are not due to selective 
amplification of deregulated HML-2 loci by one or the other 
primer pair. Along the same lines, we rule out that the Schanab
et al. primers amplified partially deleted HML-2 loci that were 
not amplifiable in our study but that were (strongly) 
upregulated by UVB. At present, we do not have a good ex- 
planation for the observed differences. Additional studies, 
such as analysis of HML-2 transcripts amplified by the 
Schanab et al. primers, will be required to potentially explain 
this discrepancy. 

We revealed in this study HML-2 loci transcribed in mela- 
noma. The identified HML-2 loci are of immediate relevance if 
HML-2 transcripts or encoded proteins are considered to play a
role in melanoma. The presented data reveal rather complex 
transcription profiles of HML-2 loci in melanoma in that vari- 
ous HML-2 loci display variable transcriptional activities in dif- 
ferent melanoma-derived samples. CpG methylation, histone 
modifications, and transcription factors very likely regulate in 
concert transcription of HML-2 LTRs. The epigenetic status of 
HML-2 loci may be different in every melanoma sample inves- 
tigated in our study, resulting in different transcriptional activ- 
ities of HML-2 LTRs. However, although changes in 
methylation and chromatin status are often observed in 
cancer cells, only specific and detailed epigenetic studies will 
be able to explain (in)activity of HML-2 loci in the various melanoma
samples. As for a biological role of Rec and Np9 proteins
in melanoma, our data indicate transcription of Rec- and
Np9-encoding loci both in melanoma and normal melanocytes.
Provided that the proteins are also found at corresponding
levels, our findings imply that Rec and/or Np9 may rather
have indirect effects during melanoma development. Detailed 
molecular and cell biology studies will be required to elucidate 
the role of those proteins in melanoma development. 
However, our study provides important information as to 
which Rec and Np9 protein sequences from which HML-2 
loci should be considered in such studies. Our study further- 
more points toward additional studies regarding the ERVK-6 
locus as it may represent a biomarker for melanoma. Last but 
not least, our study revealed, for the first time, HML-2 loci that 
were generated by reverse transcription of spliced 
HML-2 RNA. Moreover, both loci are transcribed and 
potentially encode Rec- and Env-like proteins, the function 
and relevance of which are currently unknown. The data presented
here and previous investigations demonstrate that
transcription of HERV loci is complex. Many more specific 
studies are required to further elucidate when and how 
HERV loci are transcribed and how transcribed HERV loci 
could be involved in disease. 

Supplementary Material 

Supplementary figures S1-S3 and tables S1 and S2 are available
at Genome Biology and Evolution online
(http://www.gbe.oxfordjournals.org/).